Questions:

Does a model derived gradient between wetlands and uplands correspond to a gradient of soil development along a catena/hillslope characterized in a hydropedological framework?

If the ends of the wetland/upland classification are accurate, does the intermediate range correspond with hypothesized soil formation characteristics according to a hydropedological framework?

hydropedology hillslope model from Bailey et al., 2014
hydropedology hillslope model from Bailey et al., 2014

The Hubbard Brook Experimental Forest

From Hubbard Brook website https://hubbardbrook.org/about-the-forest/:

Hydropedological units in the Hubbard Brook Experimental Forest include:

Gather and filter NWI data

## [1] "ATTRIBUTE"  "WETLAND_TY" "ACRES"      "Shape_Leng" "Shape_Area"
## SpatRaster object downsampled to 775 by 1291 cells.

Gather Hubbard Brook hydrography

Get training data - wetland and upland points

Total area is 5816.7781 ha Wetland area for training data is 66.63021 ha (~10%)

Non-wetland areas or “upland” areas have removed wetland/wet areas with a 5m buffer to reduce potential mis-classification. Some wet areas are not identified yet, which means some potential non-wetland areas may contain wet areas. Making the non-wetland point dataset larger can offset this a bit

## [1] 1947

## SpatRaster object downsampled to 775 by 1291 cells.
## [1] 530
## SpatRaster object downsampled to 775 by 1291 cells.

Make terrain metrics

hbslp_3 <- SlpAsp(hbdem, w = c(3, 3), metrics = "slope", filename = "UplandWetlandGradient/data/derived_data/hbslp_3.tif",
    overwrite = T)

hbslp_27 <- SlpAsp(hbdem, w = c(27, 27), metrics = "slope", filename = "UplandWetlandGradient/data/derived_data/hbslp_27.tif",
    overwrite = T)

hbslp_81 <- SlpAsp(hbdem, w = c(81, 81), metrics = "slope", filename = "UplandWetlandGradient/data/derived_data/hbslp_81.tif",
    overwrite = T)
hbtpi_3 <- TPI(hbdem, w = c(3, 3), shape = "rectangle", stand = "none",
    na.rm = TRUE, filename = "UplandWetlandGradient/data/derived_data/hbtpi_3.tif",
    overwrite = T)

hbtpi_27 <- TPI(hbdem, w = c(27, 27), shape = "rectangle", stand = "none",
    na.rm = TRUE, filename = "UplandWetlandGradient/data/derived_data/hbtpi_27.tif",
    overwrite = T)

hbtpi_81 <- TPI(hbdem, w = c(81, 81), shape = "rectangle", stand = "none",
    na.rm = TRUE, filename = "UplandWetlandGradient/data/derived_data/hbtpi_81.tif",
    overwrite = T)
hbcurv_3 <- Qfit(hbdem, w = c(3, 3), metrics = c("meanc", "profc",
    "planc"), unit = "degrees", na.rm = TRUE, filename = "UplandWetlandGradient/data/derived_data/hbcurv_3.tif",
    overwrite = T)

hbcurv_27 <- Qfit(hbdem, w = c(27, 27), metrics = c("meanc",
    "profc", "planc"), unit = "degrees", na.rm = TRUE, filename = "UplandWetlandGradient/data/derived_data/hbcurv_27.tif",
    overwrite = T)

hbcurv_81 <- Qfit(hbdem, w = c(81, 81), metrics = c("meanc",
    "profc", "planc"), unit = "degrees", na.rm = TRUE, filename = "UplandWetlandGradient/data/derived_data/hbcurv_81.tif",
    overwrite = T)
hbasp_3 <- Qfit(hbdem, w = c(3, 3), metrics = c("qaspect"), unit = "degrees",
    na.rm = TRUE, filename = "UplandWetlandGradient/data/derived_data/hbasp_3.tif",
    overwrite = T)
hbhli_3 <- spatialEco::hli(hbasp_3)

hbasp_27 <- Qfit(hbdem, w = c(27, 27), metrics = c("qaspect"),
    unit = "degrees", na.rm = TRUE, filename = "UplandWetlandGradient/data/derived_data/hbasp_27.tif",
    overwrite = T)
hbhli_27 <- spatialEco::hli(hbasp_27)

hbasp_81 <- Qfit(hbdem, w = c(81, 81), metrics = c("qaspect"),
    unit = "degrees", na.rm = TRUE, filename = "UplandWetlandGradient/data/derived_data/hbasp_81.tif",
    overwrite = T)
hbhli_81 <- spatialEco::hli(hbasp_81)

Slope at different scales

TPI at different scales

Curvature at different scales

Aspect/HLI at different scales

Whitebox tools Deviation from mean elevation

Is resampling the same as setting the window size larger?

## [1] "class" "dem1m" "meanc" "profc" "planc"
##  class       : SpatVector 
##  geometry    : points 
##  dimensions  : 1495, 5  (geometries, attributes)
##  extent      : 275091.5, 283296.5, 4866362, 4871276  (xmin, xmax, ymin, ymax)
##  coord. ref. : NAD83 / UTM zone 19N (EPSG:26919) 
##  names       : class dem1m      meanc      profc     planc
##  type        : <chr> <num>      <num>      <num>     <num>
##  values      :   UPL 688.7  -0.001151 -0.0001594 -0.002142
##                  UPL   584   0.001018   0.001373 0.0006639
##                  UPL 581.4 -0.0007899  -0.001685 0.0001053

Hydrology metrics

Hydrologic conditioning

D-infinity flow accumulation

Topographic Wetness Index

Depth to water from cost distance function

Stack rasters and use point data for extraction

Set up random forest model

## randomForest 4.7-1.1
## Type rfNews() to see new features/changes/bug fixes.
## 
## Attaching package: 'randomForest'
## The following object is masked from 'package:dplyr':
## 
##     combine
## The following object is masked from 'package:spatialEco':
## 
##     combine
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 4.3.1
## 
## Attaching package: 'ggplot2'
## The following object is masked from 'package:randomForest':
## 
##     margin
## Loading required package: lattice
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction UPL WET
##        UPL 711  33
##        WET  17 326
##                                           
##                Accuracy : 0.954           
##                  95% CI : (0.9398, 0.9657)
##     No Information Rate : 0.6697          
##     P-Value [Acc > NIR] : < 2e-16         
##                                           
##                   Kappa : 0.8948          
##                                           
##  Mcnemar's Test P-Value : 0.03389         
##                                           
##             Sensitivity : 0.9766          
##             Specificity : 0.9081          
##          Pos Pred Value : 0.9556          
##          Neg Pred Value : 0.9504          
##              Prevalence : 0.6697          
##          Detection Rate : 0.6541          
##    Detection Prevalence : 0.6845          
##       Balanced Accuracy : 0.9424          
##                                           
##        'Positive' Class : UPL             
## 
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction UPL WET
##        UPL 274  24
##        WET  24 145
##                                          
##                Accuracy : 0.8972         
##                  95% CI : (0.866, 0.9232)
##     No Information Rate : 0.6381         
##     P-Value [Acc > NIR] : <2e-16         
##                                          
##                   Kappa : 0.7775         
##                                          
##  Mcnemar's Test P-Value : 1              
##                                          
##             Sensitivity : 0.9195         
##             Specificity : 0.8580         
##          Pos Pred Value : 0.9195         
##          Neg Pred Value : 0.8580         
##              Prevalence : 0.6381         
##          Detection Rate : 0.5867         
##    Detection Prevalence : 0.6381         
##       Balanced Accuracy : 0.8887         
##                                          
##        'Positive' Class : UPL            
## 
## [1] 29.776

AUC/ROC

## Type 'citation("pROC")' for a citation.
## 
## Attaching package: 'pROC'
## The following objects are masked from 'package:stats':
## 
##     cov, smooth, var
## 
## Attaching package: 'ROCR'
## The following object is masked from 'package:performance':
## 
##     performance
## [1] "Area under the ROC curve"
## [[1]]
## [1] 0.8887256

## [1] 0.8972163
## [1] 0.9194631
## Warning in confusionMatrix.default(as.factor(test_df$wetupl),
## as.factor(test_df$test_class)): Levels are not in the same order for reference
## and data. Refactoring data to match.
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

PDF/kernel density plot

Boosted Regression Trees

##                       var     rel_inf
## dev81               dev81 56.56816162
## slp_27             slp_27 11.98661984
## slp_81             slp_81  5.78882225
## dem                   dem  5.41586322
## dev27               dev27  4.37506444
## plan_curv_27 plan_curv_27  2.91163285
## meancurv_81   meancurv_81  2.72417947
## plan_curv_81 plan_curv_81  2.21375416
## tpi_27             tpi_27  1.76460306
## slp_3               slp_3  1.10757720
## plan_curv_3   plan_curv_3  0.92665550
## prof_curv_81 prof_curv_81  0.91851155
## tpi_81             tpi_81  0.77612130
## prof_curv_27 prof_curv_27  0.72856500
## meancurv_27   meancurv_27  0.31287102
## hli_81             hli_81  0.29643587
## hb_sca_log     hb_sca_log  0.28151554
## hli_3               hli_3  0.22607859
## twi                   twi  0.22502912
## hli_27             hli_27  0.10857315
## tpi_3               tpi_3  0.10689530
## meancurv_3     meancurv_3  0.09895650
## prof_curv_3   prof_curv_3  0.08669080
## dev3                 dev3  0.05082266
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction UPL WET
##        UPL 269  24
##        WET  29 145
##                                           
##                Accuracy : 0.8865          
##                  95% CI : (0.8542, 0.9138)
##     No Information Rate : 0.6381          
##     P-Value [Acc > NIR] : <2e-16          
##                                           
##                   Kappa : 0.7558          
##                                           
##  Mcnemar's Test P-Value : 0.5827          
##                                           
##             Sensitivity : 0.9027          
##             Specificity : 0.8580          
##          Pos Pred Value : 0.9181          
##          Neg Pred Value : 0.8333          
##              Prevalence : 0.6381          
##          Detection Rate : 0.5760          
##    Detection Prevalence : 0.6274          
##       Balanced Accuracy : 0.8803          
##                                           
##        'Positive' Class : UPL             
## 

Predict maps

## SpatRaster object downsampled to 775 by 1291 cells.
## SpatRaster object downsampled to 775 by 1291 cells.

## # A tibble: 3 × 2
##   kmeanclus meanWET
##   <fct>       <dbl>
## 1 1           0.544
## 2 2           0.266
## 3 3           0.119

Histogram of raster

## Warning: [hist] a sample of 17% of the cells was used (of which 47% was NA)